Abstract

Recent state and national education reform initiatives have focused on outcomes and quantifiable 
data. The measurement of educational indicators is playing a central role in the current wave of 
reform as various groups seek to produce policy-relevant information on the educational 
performance and status of children and youth in our schools. This report summarizes activities of 
the National Center on Educational Outcomes (NCEO) directed at producing a report on the status 
of students with disabilities from the secondary analysis of state collected achievement data. 
Although more than half of the 50 states reported that large-scale achievement data were available 
for some students with disabilities, potentially usable data were obtained from only six states. 
Numerous difficulties were encountered in attempts to collect and aggregate state achievement data 
on students with disabilities. It was concluded that it is currently not possible to produce a
synthesis report on the achievement status of students with disabilities from aggregated state data
bases. Recommendations are presented for improving the probability of conducting such analyses
in the future. 
Secondary Analysis of State Assessment Data:
Why We Can't Say Much About Students with Disabilities 

The National Center on Educational Outcomes (NCEO) for students with disabilities was 
established in October, 1990 to work with state departments of education, national policy-making 
groups, and others to facilitate and enrich the development and use of indicators of educational 
outcomes for students with disabilities. It is believed that responsible use of indicators will enable 
students with disabilities to achieve better results from their educational experiences. 

One of the four major strategic goals of the NCEO is to enhance the availability and use of 
outcomes information in decision making at state and federal levels. A variety of activities are
subsumed under this broad goal. Two activities focus on determining the feasibility of extracting 
quality and credible policy-relevant information on the educational status and performance of 
students with disabilities from state and national data collection programs (McGrew, Spiegel, 
Thurlow, Ysseldyke, Bruininks, & Shriner, 1992). The primary goal is to produce synthesis 
reports that describe the educational outcomes of children and youth with disabilities based on the 
secondary analysis of data in existing state and national data collection programs. A complete 
description of the NCEO's activities in the area of secondary analysis of data from state and 
national data collection programs can be found in a separate NCEO report (McGrew, Spiegel, 
Thurlow, Ysseldyke, Bruininks, & Shriner, 1992). 

NCEO findings related to the analysis of national data collection programs (e.g., the 
National Assessment of Educational Progress-NAEP) have been reported (McGrew, Algozzine,
Spiegel, Thurlow, & Ysseldyke, 1993; McGrew, Spiegel, Thurlow, & Kim, 1994; McGrew, 
Thurlow, Shriner, & Spiegel, 1992; McGrew, Thurlow, & Spiegel, 1993). Our review showed 
that although many important outcome indicators for individuals with disabilities are included in 
existing national data collection programs, secondary analysis of the data gathered by these
programs is limited by the significant exclusion of students with disabilities and the variable 
identification of these individuals in the data bases. 
This report describes the NCEO's efforts to secure and conduct secondary analyses of data
collected by large-scale state assessment programs. 

The Current Context: Measurement-Driven Education Reform 
Our nation is becoming "increasingly dependent on statistics for policy analysis and 
decision making" (Andrew, 1984, p. 51); "school reform has riveted national attention on the 
numbers" (Hanford & White, 1991). Reform initiatives throughout the educational system are 
shifting the focus toward outcomes and quantifiable data. With increasing frequency, the data 
needed to monitor and evaluate education reform activities are being drawn from state and national 
data collection programs. 

During the current wave of reform, state education agencies (SEAs) are being asked to do 
more than just keep track of the number of students enrolled or how much money was spent per 
pupil. SEAs are being pushed to look at the outcomes achieved by students within their 
educational systems. This trend is evident in the move toward publishing state comparisons from 
the National Assessment of Educational Progress (NAEP) Trial State Assessment, the Scholastic 
Aptitude Test (SAT), and others. It is also evident in the increased number of reports like those 
published by the Council of Chief State School Officers (CCSSO), which describe how states are 
doing in various aspects of education. There is clearly a press for policy-relevant information 
about the performance of students in our educational system. 

In addition to the general education reform movement, recent state and national reform 
initiatives in special education (Skrtic, 1991) have resulted in increased interest in outcomes 
information. Since the passage of PL 94-142 in 1975, there has been more than a decade of 
evaluation studies that have focused primarily on the issue of educational access for students with 
disabilities and implementation of the processes embodied in the law. Increasingly the question of 
"where's the beef?" has been asked from both within and outside of special education. Focus has 
recently turned toward evaluating the outcomes of special education, or, "where are the data?" on 
effectiveness (DeStefano & Wagner, 1991). 
Current State Activities

In NCEO surveys of state activities in the assessment of educational outcomes (NCEO, 
1992, 1993, 1994) we found that there are only a few state-level special education data collection 
efforts, other than special post-school status studies, that regularly gather outcome data on students 
with disabilities. Most state outcomes information is generated from large-scale general education 
assessment programs in which some students with mild disabilities may participate. Thus, the 
only potentially useful source of outcomes data for students with disabilities that might be
aggregated across states is the set of large-scale general education assessment programs. In particular,
given that almost 90% of all states collect some form of achievement data (NCEO, 1994), the
secondary analysis of state achievement outcome information might produce useful information on 
the achievement outcomes of some students with disabilities. 

In this report we focus on the feasibility of aggregating achievement outcome information 
across large-scale state general education assessment programs. Our original purpose was to 
produce policy-relevant reports on the educational status of students with disabilities. 

Method 

Sample 

In the Spring of 1991, state directors of special education or their designees responded to 
the annual NCEO national survey of state special education outcomes activities (NCEO, 1992). 
This survey was used to gather information on state efforts in the areas of federally-reported data, 
assessment of outcomes, inclusion of students with disabilities in state assessments, state 
assessment needs and highlights, activities in selected outcome areas, and practices, programs, and 
plans related to outcomes. 

In the initial annual survey, 49 of the 50 states reported that some students with disabilities
took part in their general education large-scale achievement assessments. These state assessments 
typically varied from the administration of nationally-normed commercial achievement tests (e.g., 
Stanford Achievement Test) to state-developed norm-referenced or minimum competency exams.
Slightly more than half of the 50 states (n=27; 54%) indicated that students with disabilities could
be identified in their data sets. In other words, some variable was present in the state data base that
indicated each student's special education status. These 27 states were the initial sample selected
for inclusion in the current investigation.

Data Gathering Procedures

Individual follow-up phone calls were made by the NCEO staff to the 27 identified states to 
inquire about the possibility of the state providing a copy of its large-scale achievement data base to
the NCEO. The individuals contacted were those working in state divisions or departments 
responsible for collecting the large-scale assessment data. While some state personnel were 
working within the state Special Education division and some within the General Education 
division, most states had separate divisions under the Department of Education umbrella that were 
designated as responsible for the large-scale student data collection program. The individuals 
contacted were from departments with titles such as: Pupil Accountability; Assessment, Testing 
and Evaluation; State Testing and Evaluation Center; Assessment and School Information, Special 
Programs Division; Division of Accountability; Student Performance Assessment; Bureau of 
Statewide Assessment; Special Education Services; and the Division of Research, Evaluation, and 
Assessment.

For states that indicated a willingness to provide the NCEO a copy of their data files, 
follow-up calls were completed to ask specific questions about cost, type of computer format and 
medium, and the time it would take for the NCEO to acquire the data. For those states that
responded positively to this initial contact, a formal letter was sent requesting a copy of the relevant 
computer data files. The NCEO request described the purpose of the activity and the data privacy 
safeguards that would operate during the NCEO's use of the data. All states were assured that no 
NCEO reports based on the analyses of the data would identify their state, and that the primary 
focus would be to aggregate data across states. 



Technical Report 10 

Upon receipt of each state's data files, the files were converted (if necessary) to a usable 
format. Descriptive analyses and file verification runs (Fortune & McBee, 1984) were completed
for each data set to confirm the accuracy of the data and to determine the degree of confidence that
could be placed in the data contained in each data file. Information regarding each state data base
was sought in the following areas:

♦ special education categories used to identify students with disabilities 

♦ grades assessed 

♦ domains assessed (e.g., reading, mathematics, etc.) 

♦ total sample size and size of subsamples of students with disabilities 

♦ type of assessment (norm referenced or minimum competency) 

♦ metric or scale used to report the assessment results 

Results 

Response to NCEO Request 

Of the 27 states originally identified as having potentially useful data for secondary 
analysis, the NCEO was able to secure copies from only six states (n=6; 22%). The ability to
secure data from 6 of the 50 states reflects a success rate of only 12%. The reasons the NCEO was 
unable to secure data files from the other 21 states are summarized in Table 1. 
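
The response figures above are simple proportions, and the counts in Table 1 account for all 21 states that did not supply usable files. A minimal Python sketch of that arithmetic, using only the counts reported in this section (the variable and function names are illustrative, not part of the original study):

```python
# Counts reported in this section of the report.
TOTAL_STATES = 50
IDENTIFIED_STATES = 27   # states reporting identifiable students with disabilities
USABLE_STATES = 6        # states whose data files were potentially usable

# Reasons data could not be obtained, as tallied in Table 1.
reasons = {
    "Unreliable or no coding of students with disabilities": 6,
    "Unresponsive to NCEO requests": 5,
    "No achievement data gathered for students with disabilities": 3,
    "Aggregate and not individual data available": 3,
    "Data file unreadable or appeared to contain errors": 2,
    "Confidentiality concerns expressed by state": 1,
    "Excessive acquisition cost": 1,
}


def percent(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


identified_rate = percent(IDENTIFIED_STATES, TOTAL_STATES)   # share of all states in the sample
yield_of_sample = percent(USABLE_STATES, IDENTIFIED_STATES)  # usable files among the 27 states
yield_of_nation = percent(USABLE_STATES, TOTAL_STATES)       # usable files among all 50 states
states_without_data = sum(reasons.values())                  # should equal 27 - 6

print(identified_rate, yield_of_sample, yield_of_nation, states_without_data)
```

The final check, that the Table 1 counts sum to the 21 states without usable data, is a quick way to confirm that every contacted state is accounted for by exactly one reason.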

Although most of the identified states included students with disabilities to some extent in 
their statewide testing programs, personnel in six states indicated that they were not able reliably to 
identify and disaggregate the data for these students. Personnel in five states simply did not 
respond to the NCEO's repeated requests. Contrary to the information provided to the NCEO 
during the annual survey of states, personnel in three additional states indicated that no 
achievement data had been collected. Usually, this discrepancy resulted when the respondent to
the original state survey interview indicated that such data were available, but follow-up with the 
state person with direct responsibility for collecting the data indicated that the original survey 
information provided was not accurate. Personnel in three states indicated that their data were only 
available at an aggregated state level, and one state voiced a concern about confidentiality as the 
reason for not sharing its data. For one state, the cost of securing a copy of the file was 
prohibitive. Finally, although two additional states provided data files, they were found to be 
either unreadable or, as a result of data verification procedures, were suspected to contain errors. 


Table 1

Reasons Why Data Base Files Could Not Be Obtained for 21 of 27 States

Reason                                                                  Number of states
Unreliable or no coding of students with disabilities in data file      6
Unresponsive to NCEO requests                                           5
No achievement data gathered for students with disabilities             3
Aggregate and not individual data available                             3
Data file was unreadable or appeared to contain errors                  2
Confidentiality concerns expressed by state                             1
State wanted to charge an excessive acquisition cost                    1

Analysis of Received Data Bases 

The computer data files and related documentation provided by the six states were reviewed 
to ascertain the degree to which secondary data analysis of the aggregated data was feasible. The 
first analysis focused on the type of assessment information that was available across grades. A 
summary of the six states by the academic domain assessed, grade level assessed, and type of 
assessment (norm-referenced or minimum competency tests) is presented in Table 2. A review of 
Table 2 indicates that all state data bases included information in the academic domains of reading 
and mathematics. Four of the six state data bases included information about writing or language.
One state data base included information about other academic areas (e.g., social studies, science). 

Five of the six state-provided data sets included scores from state-specific minimum 
competency (MC) tests. Aggregation across states within grade levels was determined not to be 
feasible given that even if the most common assessment format (MC) was used, at best, data were 
available for only 1 to 2 states at any specific grade. This reflects only 2% to 4% of all 50 states. 

Further complicating any potential aggregated secondary analysis was the finding that each 
state's MC test had its own unique scaled score, and that these were not comparable across states. 
In addition, of the two states that provided norm-referenced (NR) scores that allowed for relative 
standing comparisons (e.g., percentile ranks), one provided scores based on a national norm group 
while the other provided locally normed scores based on over 60 different assessment tasks. The 
use of data from assessment instruments based on two different types of norm-referenced groups 
and four different minimal competency scales presents an almost impossible situation in any 
attempt to aggregate or informally compile results across the six states. 

Table 2

Analysis of Received Data Bases by Assessment Domains, Grade Level Assessed, and Type of
Assessment

                                   Writing/                Number
           Reading     Math        Language    Other       of States
Grade      NR    MC    NR    MC    NR    MC    NR    MC    NR    MC
3          A           A           A                       1     0
4                BE          BE          E           E     0     2
6                F           F           F                 0     1
7                B           B                             0     1
8          A     DE    A     DE    A     E           E     1     2
10         C     BC    C     BC          C                 1     2
11               E           E           E           E     0     1
12         A           A           A                       1     0
Number of
States     2     5     2     5     1     3     0     1

Note. Letters represent individual states (A-F). NR = norm-referenced; MC = minimum
competency. State F also provided data for grades 7 and 8. However, the students in
these grades were students who had failed the minimum competency exam in grade 6, and thus
represented biased samples.

A second complication for the aggregation of data across states was inconsistency in the
identification of students with disabilities in the state data bases and the exclusion of many students
with disabilities. This became evident when the optimal aggregation strategy was examined.
Based on the results summarized in Table 2, it was determined that the largest amount of
information possible (5 states) would be obtained by aggregating MC results (most likely the percent
of students with disabilities above and below each state's MC criterion score) in reading and/or
mathematics for students with mild disabilities. Often only a portion of the student population,
usually those students with mild disabilities, is included in large-scale state and national
assessments. Thus, the identification of students with mild disabilities (who comprise approximately
80% to 90% of the student disability population; Reschly, 1987) in the state MC reading and math
data bases was examined. The results are presented in Table 3.

A review of Table 3 indicates that for each of the four federal disability categories examined 
(viz., learning disability, mental retardation, speech impairment, serious emotional disturbance), 
only one state data base provided for the identification of each disability at most grade levels. Two 
states provided for the identification of the four disability categories at grade 10. Two state data 
base files included no categorically based disability variables. 

Even the correspondence between the two states at grade 10 does not ensure comparability
of identified groups across states, since some states provide for differentiation within categories
(e.g., mental retardation) by level of disability (e.g., educable, trainable, severe), while others use
a global mental retardation category. Differences in the operationalization of similar variables in
different data bases are a problem frequently encountered in secondary data analysis (Kiecolt &
Nathan, 1985). Even if these problems were ignored, the production of outcome reports for
students with disabilities at any specific grade level would be based on only 1 or 2 of the 50 states.
Generalizing to all 50 states from four percent or fewer of the states is very problematic.

Table 3

Number of States that Provided Reading and Mathematics Minimum Competency Data that
Identified Students in Four "Mild" Disability Categories by Five Grade Levels

                           Disability Category
         Learning      Mental         Speech        Serious Emotional
Grade    Disability    Retardation    Impairment    Disturbance
4        1             1              1             1
6        1             1              1             1
7        1             1              1             1
8        1             1              1             1
9        2             2              2             2

Note. The information in this table comes from five states (B, C, D, E, F) referred to in Table 2.

Even if all the above problems were ignored and an attempt were made to aggregate all MC
results collapsed across all grades and all special education disability categories, serious problems
in the representativeness of the results would remain. Based on either the sample size documentation
provided by each state (combined with the annual state special education child count) or the results
reported for the six states in the annual state survey, estimates were made of the proportion of the
student population with disabilities that was excluded from the data bases. It is estimated that, of
the five state MC data bases reviewed, most include only one-quarter to one-half of each state's
student population with disabilities. As is often the case in secondary data analysis, sample
comparability would be a major concern. In any attempt to aggregate or compare the data from
different state data files, significant and variable rates of exclusion of students with disabilities
would be found. The pooling of small subsamples from large independent data collection programs
(in this case state data bases) does not guarantee a representative sample and often results in
significantly increased sampling error (Kiecolt & Nathan, 1985). Generalization to the population of
students with disabilities in each state, let alone for the nation, would be prone to serious error.
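
The exclusion estimates described above amount to comparing the number of students with disabilities who appear in a state's assessment file with that state's special education child count. A minimal sketch of that comparison, assuming invented counts (the state labels and figures below are hypothetical and are not drawn from any state's data):

```python
def coverage(tested: int, child_count: int) -> float:
    """Proportion of a state's students with disabilities present in its assessment file."""
    if child_count <= 0:
        raise ValueError("child_count must be positive")
    return tested / child_count


# Hypothetical (students in assessment file, special education child count) pairs.
hypothetical_states = {
    "State X": (2_500, 10_000),
    "State Y": (4_000, 8_000),
    "State Z": (3_000, 9_000),
}

rates = {name: coverage(t, n) for name, (t, n) in hypothetical_states.items()}

# A single pooled figure hides the fact that each state excluded a different share.
pooled = coverage(
    sum(t for t, _ in hypothetical_states.values()),
    sum(n for _, n in hypothetical_states.values()),
)

for name, rate in rates.items():
    print(f"{name}: {rate:.0%} included, {1 - rate:.0%} excluded")
print(f"Pooled: {pooled:.0%} included")
```

Reporting the per-state rates alongside any pooled figure keeps the variable exclusion visible, which is the comparability concern raised in the paragraph above.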

Discussion 

The production of recurring, informative, and credible policy-relevant information on the 
achievement outcomes of students with disabilities from the secondary analyses of recurring data in 
large-scale state assessment programs currently is not possible. This conclusion is similar to that 
reached when attempts have been made to conduct secondary analyses of national data collection 
programs (McGrew, Algozzine, Spiegel, Thurlow, & Ysseldyke, 1993; McGrew, Spiegel, 
Thurlow, & Kim, 1994; McGrew, Thurlow, Shriner, & Spiegel, 1992; McGrew, Thurlow, & 
Spiegel, 1993). Given that many of the current education reform activities use measurable 
indicators from large-scale assessments as the index of progress, the evaluation of the education of 
most students with disabilities is being short-changed.

In the current investigation we found that although over half of the 50 states reported the 
availability of large-scale achievement data on some students with disabilities, we were only able to 
secure potentially usable data from six (12%) states. Numerous difficulties were encountered in 
obtaining large-scale assessment data bases from states that included some students with 
disabilities. These problems included simple nonresponse to requests for data, concerns about 
confidentiality, computer files with suspect or unreadable data, excessive acquisition costs, and 
unreliable identification of students with disabilities in the data bases. 

Secondary analysis of the limited number of state data bases that were received was deemed 
inappropriate due to problems with (a) sparse data at individual grade levels even after aggregation, 
(b) noncomparability of types of data (national vs. local norm-referenced scores; state-specific 
minimum competency scales), (c) variable or no identification of student disability characteristics 
across data bases, and (d) significant and variable exclusion of large proportions of students with
disabilities in the large-scale state assessments. 

The conclusions reached in this report should not be construed as a general indictment of
most state assessment activities. It is important to recognize that the problems encountered in this 
investigation are due to attempts to use data bases originally developed for a different purpose. 
Most large-scale state assessment programs provide extremely important, reliable, and valid 
information for general education state-level analyses and decision making. Large-scale state 
assessment programs are designed and operated to meet the unique needs of each state. They 
typically are not designed or documented to meet the needs of independent researchers who wish to 
conduct secondary data analyses, especially aggregated analyses across a number of states. Still, 
improvements are possible in large-scale state assessment programs in the areas of greater 
inclusion of students with disabilities and the identification of these students in the final data bases. 

Some might argue that although limited in scope, cautious analyses of the six state data 
bases secured by the NCEO might be informative. Such analysis of the obtained data could 
possibly produce statements such as: "X percent of a portion of 10th grade students identified with 
learning disabilities in two states demonstrated minimum competency in reading achievement (as 
defined differently by each state). However, caution must be exercised in generalizing to all states 
since data were available for only two states (4% of all states), only a portion of students with 
learning disabilities (most likely the highest functioning students in this category) were included in 
the analyses, different proportions of all students with learning disabilities were included by each 
state, and the sampling error in the pooled data set is unknown and may be very large." 

We believe that under conditions of national importance, policy decisions should be made 
on the basis of information that is believable. Statements such as the one above would do little to 
instill confidence in the results and conclusions. In fact, such statements would most likely 
generate more arguments about the accuracy of the results, a discussion that would detract from the 
more important dialogue needed around the educational policy issues that were the focus of the 
original research questions. Furthermore, the production of such information is simply bad 
science and cannot be encouraged. Although many compromises are inevitable in secondary data
analysis (McGrew, Spiegel, Thurlow, Ysseldyke, Bruininks, & Shriner, 1992), compromise of 
good scientific standards is not, particularly if the data are to be used to develop public policy 
(Bailar, Roger, & Passel, 1982). "It seems preferable to accomplish less with appropriate data 
than it is to reduce the study's credibility with caveats" (Fortune & McBee, 1984, p.40). 

Finally, "given the magnitude of federal and state support for educational programs for 
students with disabilities, support that reflects the valuing of this population in our society, it is 
time that this implied value be matched by the commitment of resources to address the numerous 
political and technical hurdles that must be overcome in order to be able to extract useful and 
routine information on the educational and quality of life outcomes for individuals with disabilities" 
(McGrew, Algozzine, Spiegel, Thurlow, & Ysseldyke, 1993, p. 11). Although currently it is not 
possible to produce routine, quality information regarding the educational outcomes of students 
with disabilities through the secondary analyses of data gathered through large-scale state 
assessment programs, this does not mean this approach should be discarded. 

Toward the goal of improving the collection and reporting of information from analysis of
large-scale state assessment data bases, we offer the following "starting points" for consideration:

1. The most important steps that can be taken are not those that focus on secondary data 
analysis issues, but steps that would improve the quality of data available on students with 
disabilities for each state. The implementation of four suggestions within states would go a 
long way to ensuring more and better state data for evaluating the progress of students with
disabilities within each state. These suggestions are: 

a. Increase the inclusion of students with disabilities in state data collection programs. 
This can be done by first increasing adherence to existing guidelines for inclusion of 
students with disabilities. A second step would be the development of broader and 
more uniform assessment eligibility guidelines and increased use of assessment 
modifications for certain students (McGrew, Thurlow, Shriner, & Spiegel, 1992). 
Recommended guidelines for inclusion and test accommodations are described in detail 
in a separate NCEO report (Ysseldyke, Thurlow, McGrew, & Vanderwood, 1994).

b. Include in the background information questionnaire used to collect data on students
who participate in a state's large-scale assessment program, additional variables that
would better describe those students with disabilities who are included and excluded.
An example set of possible variables has been developed and is presented in a separate
NCEO report (McGrew, Algozzine, Spiegel, Thurlow, & Ysseldyke, 1993). This
additional information would help to determine the generalizability of the data from the
students with disabilities who participated in a state assessment to all students with
disabilities in specific categories within the state.

c. Increase the consistency of the identification of students with disabilities in the final
computer data files. States that currently do not allow for the identification of students 
with specific disabilities in their final data sets should consider adding such variables to 
their data files. 

d. Consider expanding the recurring state data collection programs to include other
outcome domains besides academic achievement. Important outcome information in
such domains as personal and social adjustment, responsibility and independence,
physical health, contribution and citizenship, and satisfaction would provide a more
comprehensive picture of the status of all children. More importantly, assessments in
many of these non-achievement domains would not consist of paper-and-pencil "tests";
the information could instead be gathered through other methods such as administrative
record reviews and third-party informants (e.g., parent and teacher surveys). For example,
many large-scale national assessment programs directed by the National Center for Education
Statistics (NCES) and the National Center for Health Statistics (NCHS) make routine
use of these data gathering methods. In many cases, data can be gathered for almost all
students with disabilities on the relevant measures, since the student is not required to
complete a test or survey independently. States are encouraged to review
the NCEO's comprehensive conceptual models of outcome domains and indicators that
address many of these domains at different points during a student's development
(e.g., early childhood; grades 4, 8, and 12; post-school).

2. The second set of suggested steps are those that would increase the probability of
conducting secondary analysis of aggregated state data base information for students with
disabilities. These general suggestions include: 

a. Initiate a dialogue among appropriate state assessment personnel (e.g., state data
managers) on the feasibility of using a common set of data gathering and reporting
strategies, guidelines, and/or standards that might produce more common or related
data elements specific to students with disabilities across state assessment programs.
Cooperative efforts similar to those that produced the Standards for Education Data
Collection and Reporting (SEDCAR) (NCES, 1991) might be particularly worthwhile.

b. For states that include their state disability-specific categorical variables in their data
bases (variables that often differ from the federal special education categories),
methods should be explored that would allow for the development of "cross-walk"
procedures for the conversion of state disability variables to the approximate federal
categories. Increasing the number of states that can provide state-to-federal
disability-specific variable conversions would increase the feasibility of producing aggregated
state reports.

c. As mentioned in the first suggestion, states should explore the advantages of adopting 
the inclusion and assessment accommodation guidelines for students with disabilities 
that have been developed by the NCEO in cooperation with other state and national 
groups and individuals. Similar consideration should be given to including the 
additional background variables for describing students with disabilities in large-scale 
assessment programs. The increased adoption of these suggested methods and 
procedures, with or without state-specific modifications, would increase the 
comparability of the resulting samples of students with disabilities across state data 
collection programs, an important issue in any future attempt to conduct secondary 
analyses of aggregated state data. 
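
The "cross-walk" procedure in suggestion 2b can be pictured as a lookup table from each state's own disability codes to the approximate federal category. A minimal sketch, assuming invented state labels and codes (the state names, codes, and mappings below are hypothetical; only the four federal category names are taken from this report):

```python
from collections import Counter
from typing import Optional

# Hypothetical cross-walk: (state, state-specific code) -> approximate federal category.
CROSSWALK = {
    ("State X", "EMR"): "mental retardation",   # educable level collapses to the global category
    ("State X", "TMR"): "mental retardation",   # trainable level collapses to the global category
    ("State X", "SLD"): "learning disability",
    ("State Y", "LD"): "learning disability",
    ("State Y", "MR"): "mental retardation",
    ("State Y", "SP"): "speech impairment",
    ("State Y", "SED"): "serious emotional disturbance",
}


def to_federal(state: str, code: str) -> Optional[str]:
    """Return the approximate federal category, or None when no mapping is defined."""
    return CROSSWALK.get((state, code))


# Hypothetical student records as (state, state-specific disability code).
records = [
    ("State X", "EMR"),
    ("State X", "TMR"),
    ("State Y", "MR"),
    ("State Y", "LD"),
    ("State X", "OHI"),   # deliberately absent from the cross-walk
]

mapped = Counter()
unmapped = []
for state, code in records:
    category = to_federal(state, code)
    if category is None:
        unmapped.append((state, code))   # keep for review instead of guessing a category
    else:
        mapped[category] += 1

print(dict(mapped))
print(unmapped)
```

Codes without a mapping are set aside rather than forced into a federal category, so that the share of students who cannot be converted remains visible when state files are combined.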